Abstract

We explore the use of a sufficient statistic based on the identified members that
are obtained for samples that are selected under the $M_0$ capture-recapture closed
population model (Schwarz and Seber, 1999). A Rao-Blackwellized version of the
estimator based on a sufficient statistic is then presented. We explore the efficiency
of the improved estimator via a simulation study. The R code for the simulation is
provided in the appendix.
1 Introduction

We shall consider the $M_0$ model (Schwarz and Seber, 1999), where the probability of
capture for any individual $i$ on sampling occasion $k$ is $p_{ik} = p$ for all $i = 1, 2, \ldots, N$
and $k = 1, 2, \ldots, K$, where $N$ is the population size and $K$ is the number of sampling
occasions. The data that we collect from the samples is $d_0 = \{s_{01}, s_{02}, \ldots, s_{0K}\}$, where
$s_{0k}$ refers to those members of the population that are selected for sample $k$. For all
$k = 1, 2, \ldots, K$, we shall let $n_{0k} = |s_{0k}|$, and for any subset $C \subseteq \{1, 2, \ldots, K\}$ we will
let $m_C = \left| \bigcap_{k \in C} s_{0k} \right|$.
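As a small illustration of this notation, the following R sketch computes the sample sizes and the overlap counts for a toy data set. The sample contents and the helper name `m_C` are hypothetical and are chosen only for illustration.

```r
# Hypothetical toy data: K = 3 samples from a population labelled 1, ..., N.
d0 <- list(s01 = c(1, 4, 5), s02 = c(2, 4, 5, 7), s03 = c(4, 7, 9))

# n_{0k} = |s_{0k}|: the size of each sample.
n0k <- sapply(d0, length)

# m_C = size of the intersection of the samples indexed by C (illustrative helper).
m_C <- function(d0, C) length(Reduce(intersect, d0[C]))

print(n0k)                # sample sizes
print(m_C(d0, c(1, 2)))   # individuals selected for both sample 1 and sample 2
print(m_C(d0, 1:3))       # individuals selected for all three samples
```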

2 Estimation

We shall define the reduced data $d_r$ to be $d_r = \{s_0, \sum_{k=1}^{K} n_{0k}\}$ where $s_0 = \bigcup_{k=1}^{K} s_{0k}$. We
shall define a reordering of the data to be consistent with $d_r$ if the reduced data from
this reordering coincides with that of $d_r$. Hence, a reordering of the original sample
data is consistent with the reduced data if it consists of all $n_0 = |s_0|$ members (that is,
where each member is selected for at least one sample) and a total of $\sum_{k=1}^{K} n_{0k}$ members
are selected over all sampling occasions. Notice that sample reorderings that are
consistent with the reduced data can contain samples whose sizes are different from
the original sample sizes.
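The consistency condition can be checked mechanically. The R sketch below is one possible way to do so; the function names `reduced_data` and `is_consistent` and the toy samples are assumptions made for illustration and do not appear in the original code.

```r
# Reduced data d_r = (s_0, total number of selections) for a list of samples.
reduced_data <- function(d) {
  list(s0 = sort(unique(unlist(d))),     # union of all samples
       total = sum(sapply(d, length)))   # sum of the sample sizes
}

# A reordering is consistent with d_r if it yields the same reduced data.
is_consistent <- function(d_new, d_orig) {
  identical(reduced_data(d_new), reduced_data(d_orig))
}

d0 <- list(c(1, 2, 3), c(2, 3, 4))   # original samples, sizes 3 and 3
d1 <- list(c(2, 3), c(1, 2, 3, 4))   # sizes 2 and 4, same union and same total
d2 <- list(c(1, 2), c(3, 4))         # same union but only 4 selections in total

is_consistent(d1, d0)   # TRUE: sample sizes differ, reduced data agree
is_consistent(d2, d0)   # FALSE: the total number of selections differs
```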

We shall let $R$ consist of all of the reorderings of the original data that are consistent
with the reduced data. Now, suppose $\hat{N}_0$ is an estimate of the population size. For
example, in a two-sample study this estimator could be the bias-adjusted Lincoln-Petersen
estimator (Chapman, 1951), $\hat{N}_0 = \frac{(n_{01}+1)(n_{02}+1)}{m_{\{1,2\}}+1} - 1$. For each reordering
$i \in R$ we shall let $d_0^{(i)}$ be the corresponding reordered sample data (where the reduced
data corresponding with $d_0^{(i)}$ is $d_r$), $\hat{N}_0^{(i)}$ shall be the estimate of the population size
obtained with reordering $i$, and $n_{0k}^{(i)}$ shall be the number of individuals selected for
sample $k$ under reordering $i$ where $k = 1, 2, \ldots, K$. The Rao-Blackwellized version of
the preliminary estimator $\hat{N}_0$ is



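\[
\hat{N}_{RB} = E[\hat{N}_0 \mid d_r]
= \sum_{i \in R} \hat{N}_0^{(i)} P(d_0^{(i)} \mid d_r)
= \frac{\sum_{i \in R} \hat{N}_0^{(i)} P(d_0^{(i)})}{\sum_{i \in R} P(d_0^{(i)})}
\]
\[
= \frac{\sum_{i \in R} \hat{N}_0^{(i)} \, p^{n_{01}^{(i)}} (1-p)^{N - n_{01}^{(i)}} \, p^{n_{02}^{(i)}} (1-p)^{N - n_{02}^{(i)}} \cdots p^{n_{0K}^{(i)}} (1-p)^{N - n_{0K}^{(i)}}}
       {\sum_{i \in R} p^{n_{01}^{(i)}} (1-p)^{N - n_{01}^{(i)}} \, p^{n_{02}^{(i)}} (1-p)^{N - n_{02}^{(i)}} \cdots p^{n_{0K}^{(i)}} (1-p)^{N - n_{0K}^{(i)}}}
\]
\[
= \frac{\sum_{i \in R} \hat{N}_0^{(i)} \, p^{n_{01}^{(i)} + n_{02}^{(i)} + \cdots + n_{0K}^{(i)}} (1-p)^{KN - n_{01}^{(i)} - n_{02}^{(i)} - \cdots - n_{0K}^{(i)}}}
       {\sum_{i \in R} p^{n_{01}^{(i)} + n_{02}^{(i)} + \cdots + n_{0K}^{(i)}} (1-p)^{KN - n_{01}^{(i)} - n_{02}^{(i)} - \cdots - n_{0K}^{(i)}}}. \qquad (1)
\]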
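Notice that this estimator does not depend on the population size $N$: because
$n_{01}^{(i)} + n_{02}^{(i)} + \cdots + n_{0K}^{(i)} = \sum_{k=1}^{K} n_{0k}$ remains fixed over all reorderings, the powers
of $p$ and $(1-p)$ are the same in every term and cancel between the numerator and the
denominator. Hence, $d_r$ is a sufficient statistic for $N$. Also notice that all sample
reorderings are equally probable under the sufficient statistic, so the estimator reduces
to the simple average $\frac{1}{|R|} \sum_{i \in R} \hat{N}_0^{(i)}$.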
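Because every consistent reordering carries the same probability, the Rao-Blackwellized estimate can be obtained for tiny data sets by brute-force enumeration. The following R sketch does this for two samples with the bias-adjusted Lincoln-Petersen estimator. The function names `chapman` and `rb_brute` are hypothetical, and the enumeration visits $3^{n_0}$ capture histories, so it is only meant as a check on small examples.

```r
# Bias-adjusted Lincoln-Petersen (Chapman) estimate from two samples.
chapman <- function(s1, s2) {
  (length(s1) + 1) * (length(s2) + 1) / (length(intersect(s1, s2)) + 1) - 1
}

# Brute-force Rao-Blackwellization for K = 2 (illustrative helper).
# Each individual in s_0 receives capture history 1 (sample 1 only),
# 2 (sample 2 only) or 3 (both samples), so every member is selected at least once.
rb_brute <- function(s1, s2) {
  s0 <- union(s1, s2)
  total <- length(s1) + length(s2)
  hist <- as.matrix(expand.grid(rep(list(1:3), length(s0))))
  est <- numeric(0)
  for (r in seq_len(nrow(hist))) {
    h <- hist[r, ]
    r1 <- s0[h %in% c(1, 3)]   # reordered sample 1
    r2 <- s0[h %in% c(2, 3)]   # reordered sample 2
    if (length(r1) + length(r2) == total)   # keep reorderings consistent with d_r
      est <- c(est, chapman(r1, r2))
  }
  mean(est)   # equal weight for every reordering in R
}

chapman(c(1, 2, 3), c(3, 4))    # preliminary estimate: 5
rb_brute(c(1, 2, 3), c(3, 4))   # Rao-Blackwellized estimate: 4.75
```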
As an example, we shall consider a two-sample study. Notice that when considering
the sample reorderings for a two-sample study it is required that a total of $n_{01} + n_{02}$
members be selected for the two samples. Hence, in order for all $n_0$ members to
be selected and for $n_{01} + n_{02}$ selections to be made, $m_{\{1,2\}}$ must remain fixed over
the sample reorderings (since $n_{01} + n_{02} - n_0 = m_{\{1,2\}}$). Therefore the number of
reorderings that are consistent with the sufficient statistic is $\binom{n_0}{m_{\{1,2\}}} \times 2^{n_0 - m_{\{1,2\}}}$.
The reason for this is that $m_{\{1,2\}}$ members need to be selected for both samples and
the other $n_0 - m_{\{1,2\}}$ can be placed in either of the 2 samples. A more compact
version of the Rao-Blackwellized estimator, based on a two-sample study with the
original data $d_0 = \{s_{01}, s_{02}\}$, is



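\[
\hat{N}_{RB} =
\frac{\displaystyle \sum_{k=0}^{n_0 - m_{\{1,2\}}} \binom{n_0 - m_{\{1,2\}}}{k}
      \left[ \frac{(k + m_{\{1,2\}} + 1)(n_0 - m_{\{1,2\}} - k + m_{\{1,2\}} + 1)}{m_{\{1,2\}} + 1} - 1 \right]}
     {\displaystyle \sum_{k=0}^{n_0 - m_{\{1,2\}}} \binom{n_0 - m_{\{1,2\}}}{k}} \qquad (2)
\]

since there are $\binom{n_0 - m_{\{1,2\}}}{k}$ reorderings that correspond with the bias-adjusted Lincoln-Petersen
estimator

\[
\frac{(k + m_{\{1,2\}} + 1)(n_0 - m_{\{1,2\}} - k + m_{\{1,2\}} + 1)}{m_{\{1,2\}} + 1} - 1
\]

under the reduced data $d_r = \{s_0, n_{01} + n_{02}\}$.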
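The compact two-sample expression is straightforward to evaluate directly. The R sketch below is a minimal implementation under the assumption that only $n_0$ and $m_{\{1,2\}}$ are supplied; the function name `rb_compact` is hypothetical.

```r
# Compact two-sample Rao-Blackwellized estimate (illustrative helper).
# n0 = number of distinct individuals selected, m = number selected for both samples.
rb_compact <- function(n0, m) {
  k <- 0:(n0 - m)
  w <- choose(n0 - m, k)   # weight attached to each value of k
  est <- (k + m + 1) * (n0 - m - k + m + 1) / (m + 1) - 1
  sum(w * est) / sum(w)
}

# Samples of sizes 3 and 2 that share one individual give n0 = 4 and m = 1.
rb_compact(n0 = 4, m = 1)   # 4.75
```

For this input the preliminary estimate is $(3+1)(2+1)/(1+1) - 1 = 5$, while the four possible values of $k$ give estimates 4, 5, 5, 4 with weights 1, 3, 3, 1, whose weighted average is 4.75.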
3 Simulation Study

Consider the following two-sample study where we set $p$ = 0.2, 0.4, 0.6, and 0.8 and
let the population size range from 5 to 100. The following graph gives the ratio of
the variance of the improved version of the bias-adjusted Lincoln-Petersen estimator
to the variance of the preliminary estimator. For each setting of the simulation, we took
250,000 pairs of samples to keep the Monte Carlo error small.



Figure 1: Plots of the ratio of the variances of the improved estimator and the
preliminary estimator for $p$ = 0.2, 0.4, 0.6, 0.8.
Larger improvements over the preliminary estimator can be expected with the improved
estimator when the population size is small. The reason for this is that the
number of individuals selected for each sample settles around its expected value
relatively quickly as the population size grows. Hence, estimators from
reorderings that rely on sample sizes that will likely differ from the original sample
sizes (that is, where $k \approx 0$ or equivalently $k \approx n_0 - m_{\{1,2\}}$ in expression (2)) will receive
very little weight in the reorderings. Similarly, smaller values of $p$ will likely yield
more efficient improved estimators. The reason for this is that smaller values of $p$
will give rise to smaller sample sizes and therefore more homogeneous contributions
are made from the sample reorderings for the improved estimator.
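The weights in the compact two-sample expression are proportional to binomial coefficients, so their concentration can be inspected numerically. The R sketch below, with hypothetical helper names `rb_weights` and `extreme_share`, reports the share of the total weight carried by the two most extreme values of $k$ as $j = n_0 - m_{\{1,2\}}$ grows. It is only an illustration of how quickly the extreme reorderings lose weight, not a reproduction of the simulation results.

```r
# Normalized weights on k = 0, ..., j, where j = n0 - m (illustrative helper).
rb_weights <- function(j) choose(j, 0:j) / 2^j

# Share of the total weight carried by the two extreme values k = 0 and k = j.
extreme_share <- function(j) {
  w <- rb_weights(j)
  w[1] + w[j + 1]
}

# The share drops from 0.5 at j = 2 to 2 / 2^20 at j = 20.
sapply(c(2, 5, 10, 20), extreme_share)
```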

4 Discussion 

In this manuscript we have presented a method for obtaining an improved estimator
of the population size when it is assumed that the $M_0$ model holds. We have also
shown that greater gains in efficiency can be expected when the population size
is small and/or when the probability of capture of individuals on each sampling occasion
is small.

Extending this method to the other closed population models deserves future attention.
6 Appendix



# This code performs a simulation study where it is assumed the M_0 model holds.
# We use a two-sample study with the bias-adjusted Lincoln-Petersen estimator.

N = 10        # The population size
Sim = 1000    # The number of simulation runs
p = 0.5       # The probability of capture in each sample

N.LP = numeric()      # The preliminary Lincoln-Petersen estimator, bias adjusted
N.LP.RB = numeric()   # The Rao-Blackwellized estimator

for(k in 1:Sim)
{
  print(k)   # To see the simulation run

  s01 = numeric()   # Selecting sample 1
  u1 = runif(N,0,1)
  for(i in 1:N)
    if(u1[i] < p)
      s01 = union(s01,i)

  s02 = numeric()   # Selecting sample 2
  u2 = runif(N,0,1)
  for(i in 1:N)
    if(u2[i] < p)
      s02 = union(s02,i)

  # Some details of the samples
  n01 = length(s01)
  n02 = length(s02)
  nn = n01+n02   # Total number of selections

  s0 = unique(union(s01,s02))   # Members selected at least once
  n0 = length(s0)
  s12 = intersect(s01,s02)      # Members selected for both samples
  m = length(s12)

  N.LP[k] = (n01+1)*(n02+1)/(m+1)-1   # The estimator

  # The RB part
  N.LP.rb = numeric()      # Estimate for each value of kk
  size.sample = numeric()  # Total number of selections for each kk (always equals nn)
  choose.sum = numeric()   # Number of reorderings for each kk

  for(kk in 0:(n0-m))
  {
    # Reordered sample 1 has m+kk members and sample 2 has m+n0-m-kk members
    N.LP.rb[kk+1] = ((m+kk+1)*(m+n0-m-kk+1)/(m+1)-1)
    choose.sum[kk+1] = choose(n0-m,kk)
    size.sample[kk+1] = (m+kk)+(m+n0-m-kk)
  }

  N.LP.RB[k] = sum(choose.sum*N.LP.rb/sum(choose.sum))   # The final improved estimator
}

mean(N.LP); var(N.LP)
mean(N.LP.RB); var(N.LP.RB)
var(N.LP.RB)/var(N.LP)   # The ratio of the variances